Robust CNN-based speech recognition with Gabor filter kernels

Authors

Shuo-Yiin Chang

Nelson Morgan

Abstract

As has been shown extensively, acoustic features for speech recognition can be learned by neural networks with multiple hidden layers. However, the learned transformations may not generalize sufficiently to test sets that are significantly mismatched to the training data. Gabor features, on the other hand, are generated from spectro-temporal filters designed to model human auditory processing. In previous work, these features were used as inputs to neural networks, which improved word accuracy for speech recognition in the presence of noise. Here we propose a neural network architecture called a Gabor Convolutional Neural Network (GCNN) that incorporates Gabor functions into convolutional filter kernels. In this architecture, a variety of Gabor features serve as the multiple feature maps of the convolutional layer. The filter coefficients are further tuned by back-propagation training. Experiments used two noisy versions of the WSJ corpus: Aurora 4 and RATS re-noised WSJ. In both cases, the proposed architecture performs better than the other noise-robust features that we tried, namely ETSI-AFE, PNCC, Gabor features without the CNN-based approach, and our best neural network features that do not incorporate Gabor functions.
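The idea of using Gabor functions as convolutional filter kernels can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the Hann envelope, the 9x9 kernel size, the modulation frequencies, and the function names `gabor_kernel`, `gabor_bank`, and `feature_maps` are all assumptions chosen for illustration. The sketch only builds a small bank of spectro-temporal Gabor kernels and applies them to a spectrogram to obtain one feature map per kernel; in a GCNN these kernels would be the starting values of trainable filters, and the back-propagation tuning step is omitted here.

```python
import numpy as np


def gabor_kernel(n_freq, n_time, omega_k, omega_n):
    """Real 2D Gabor function: a cosine carrier under a Hann envelope.

    omega_k and omega_n are the spectral and temporal modulation
    frequencies in radians per bin/frame (illustrative values only).
    """
    k = np.arange(n_freq) - (n_freq - 1) / 2.0   # centered frequency index
    n = np.arange(n_time) - (n_time - 1) / 2.0   # centered time index
    envelope = np.outer(np.hanning(n_freq), np.hanning(n_time))
    carrier = np.cos(omega_k * k[:, None] + omega_n * n[None, :])
    kernel = envelope * carrier
    return kernel / np.linalg.norm(kernel)       # unit L2 norm


def gabor_bank(n_freq=9, n_time=9,
               spectral=(0.0, 0.5, 1.0), temporal=(0.0, 0.5, 1.0)):
    """Stack one kernel per (spectral, temporal) modulation pair."""
    kernels = [gabor_kernel(n_freq, n_time, wk, wn)
               for wk in spectral for wn in temporal]
    return np.stack(kernels)                     # (n_filters, n_freq, n_time)


def feature_maps(spectrogram, bank):
    """'Valid' 2D cross-correlation of a spectrogram with every kernel."""
    windows = np.lib.stride_tricks.sliding_window_view(
        spectrogram, bank.shape[1:])             # (F', T', n_freq, n_time)
    return np.einsum("ftij,mij->mft", windows, bank)


# Hypothetical input: 40 frequency channels x 100 frames of random data.
spec = np.random.default_rng(0).standard_normal((40, 100))
bank = gabor_bank()
maps = feature_maps(spec, bank)                  # one map per Gabor kernel
```

In a trainable setting, `bank` would be copied into the weights of the first convolutional layer, so that training starts from auditory-motivated filters rather than from random values.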

Similar resources

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

Convolutional Neural Network with Adaptive Windows for Speech Recognition

Although speech recognition systems are widely used and their accuracy is continuously increasing, there is a considerable gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Handwritten Character Recognition Using CNN Gabor-Type Filters

This paper proposes an approach for handwritten character recognition using nonlinear normalisation, a CNN Gabor-Type filter, a Location Based Dominant Orientation Map and cross correlation. Using 26 characters as templates and 4 sets of 26 unknown handwritten test characters, a maximum of 92% correct recognition is achieved. Recognition rate is studied ...

Separable spectro-temporal Gabor filter bank features: Reducing the complexity of robust features for automatic speech recognition.

To test whether simultaneous spectral and temporal processing is required to extract robust features for automatic speech recognition (ASR), the robust spectro-temporal two-dimensional Gabor filter bank (GBFB) front-end from Schädler, Meyer, and Kollmeier [J. Acoust. Soc. Am. 131, 4134-4151 (2012)] was decomposed into a spectral one-dimensional Gabor filter bank and a temporal one-dimensional-Gabor...

Feeding Hand-Crafted Features for Enhancing the Performance of Convolutional Neural Networks

Since the convolutional neural network (CNN) is believed to find the right features for a given problem, the study of hand-crafted features is somewhat neglected these days. In this paper, we show that finding an appropriate feature for the given problem may still be important, as it can enhance the performance of CNN-based algorithms. Specifically, we show that feeding an appropriate feature to t...

Publication date: 2014
